Advanced Connection Parameters¶
Prerequisites¶
In the quickstart guide, we covered the basic utilities required to run a Python function on a remote machine using a URL.
The previous tutorials covered machine definitions using BaseComputer.
As this is a subclass of URL, you can further enhance your workflows with the extra functionality it provides.
Creating a Connection¶
Assuming we have a machine which we can ssh into without issues, let's create a URL:
[2]:
from remotemanager import URL
# we will use a localhost connection to ensure compatibility for this tutorial
connection = URL(host='localhost')
We now have the concept of a “connection” to a remote machine. Although here we are using a simple “localhost” connection, remember that URL is able (and intended) to connect to machines beyond your current workstation; you do this by specifying a connection to that machine.
A simple rule-of-thumb is to use URL(<remote>)
where <remote>
is whatever you would use for a ssh <remote> ...
command.
See the relevant Quickstart section for more info.
Tip
You can quickly check your connection at any time by issuing a command on the machine with connection.cmd('...'). Running pwd and/or ls will usually show whether you’re connected to the right machine.
Testing your Connection¶
Added in version 0.5.10.
URL also provides a test_connection() method, which will attempt to connect to the remote and run a test suite. The results of this test are a strong indicator of whether remotemanager can run jobs on your machine.
[3]:
connection.test_connection()
Checking for entry point... Success (/home/ljbeal/Work/Devel/remotemanager/docs/source/tutorials)
Checking file creation in home... True
Checking file creation in /tmp... True
Checking file creation in /scratch... False
Testing remotemanager.transport.rsync:
send... Transferring 3 Files... Done
True
pull... Transferring 1 File... Done
True
Testing remotemanager.transport.scp:
send... Transferring 3 Files... Done
True
pull... Transferring 1 File... Done
True
Cleaning up... Done
Done! Made 15 calls, taking 0.19s
Approximate latency, 0.01s
Tests passed successfully
The results are returned in dictionary format and can be queried here, or from the URL
itself:
test_connection
creates a ConnectionTest
object and runs the contained tests. Within, we test the minimal required functionality for running jobs:
connection to the remote
creation of a file in at least the home dir
functional transport system
Provided these three conditions are true, the test will evaluate as passed.
[4]:
connection.connection_test.passed
[4]:
True
Some extra parameters are checked, and stored within the data
(which stores useful info), and extra
(which stores errors and minor details). You can query these if you are interested in their content.
The test also runs some basic timing checks and calculates a very rough latency:
[5]:
print(connection.connection_test.latency)
print(connection.latency) # this is also available from the root URL object
0.012922207514444986
0.012922207514444986
Ping¶
Added in version 0.5.9.
URL
also provides a ping()
method, which will attempt to run the ping
command on your system, targeting the remote. This takes the arguments n
, waiting for n
returns from the remote (defaults to 5), and timeout
, which limits the total duration (defaults to 30s).
This method will return the delay in ms as a float.
URL.cmd¶
URL is a powerful interface between Python and your remote system. The method that you will likely use most in your workflows is URL.cmd().
It has been mentioned occasionally in earlier tutorials, so here we will go through some of the more specialised features.
[6]:
connection.cmd('echo "this command is executed on the remote"')
[6]:
this command is executed on the remote
Internally URL
creates a CMD
object, and then executes the command, adding the appropriate ssh
in order to operate over the network. We will see later how this can be fine-tuned.
Error handling¶
By default, cmd will raise any errors encountered. If cmd detects anything on stderr
(that isn’t an empty string), it will be raised as a RuntimeError:
[7]:
connection.cmd('do a thing')
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[7], line 1
----> 1 connection.cmd('do a thing')
File ~/Work/Devel/remotemanager/remotemanager/connection/url.py:729, in URL.cmd(self, cmd, asynchronous, local, stdout, stderr, timeout, max_timeouts, raise_errors, dry_run, prepend, force_file, landing_dir, stream, verbose)
726 if dry_run:
727 return thiscmd
--> 729 thiscmd.exec()
730 if not local:
731 self._callcount += 1
File ~/Work/Devel/remotemanager/remotemanager/connection/cmd.py:369, in CMD.exec(self, verbose)
366 return self._fexec(stdout, stderr, verbose)
368 try:
--> 369 self._exec(stdout, stderr, verbose)
370 except OSError as E:
371 msg = "Encountered an OSError on exec, attempting file exec"
File ~/Work/Devel/remotemanager/remotemanager/connection/cmd.py:469, in CMD._exec(self, stdout, stderr, verbose)
467 if not self._async and not self.is_redirected:
468 logger.debug("in-exec communication triggered")
--> 469 self.communicate(verbose=verbose)
File ~/Work/Devel/remotemanager/remotemanager/connection/cmd.py:600, in CMD.communicate(self, use_cache, ignore_errors, verbose)
598 logger.warning("locale error detected: %s", err)
599 else:
--> 600 raise RuntimeError(f"received the following stderr: \n{err}")
602 self._stdout = _clean_output(std)
603 self._stderr = _clean_output(err)
RuntimeError: received the following stderr:
/bin/bash: -c: line 1: syntax error near unexpected token `do'
/bin/bash: -c: line 1: `do a thing'
Spurious Errors¶
Some systems can place non-critical warnings onto stderr
, which can cause otherwise perfectly functional workflows to think they have failed. If this is the case, you can use raise_errors=False
.
Note
If you have a situation where a machine is raising non-fatal errors, raise_errors=False can be passed to the actual URL, which sets any cmd call to ignore errors by default.
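The behaviour described above (anything on stderr becomes a RuntimeError unless raise_errors=False) can be illustrated without a remote at all. The following is a minimal sketch using only the standard library; run_cmd is a hypothetical helper written for this example and is not part of remotemanager.

```python
import subprocess
import sys


def run_cmd(args, raise_errors=True):
    """Hypothetical helper: run a command and treat non-empty stderr as a failure."""
    result = subprocess.run(args, capture_output=True, text=True)
    stderr = result.stderr.strip()
    if stderr and raise_errors:
        raise RuntimeError(f"received the following stderr: \n{stderr}")
    return result.stdout.strip(), stderr


# A command that succeeds, but also writes a non-critical warning to stderr.
noisy = [sys.executable, "-c", "import sys; print('ok'); print('warning', file=sys.stderr)"]

try:
    run_cmd(noisy)
except RuntimeError as err:
    print("raised:", err)

# With raise_errors=False the warning is kept for inspection instead of raised.
stdout, stderr = run_cmd(noisy, raise_errors=False)
print("stdout:", stdout)
print("stderr:", stderr)
```

The command itself worked in both cases; only the decision about what to do with stderr differs.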
The CMD Object¶
URL.cmd
also returns a CMD
object, which can be stored and queried.
The most useful properties are stdout
and stderr
which allow access to these attributes after a call.
Note
If a direct CMD
call is important, it is advisable to capture it within a variable (such as below). URL
does keep a history, but it is limited in size.
[8]:
output = connection.cmd('do a thing', raise_errors=False)
[9]:
print('cmd stdout:', output.stdout)
print('cmd stderr:', output.stderr)
cmd stdout:
cmd stderr: /bin/bash: -c: line 1: syntax error near unexpected token `do'
/bin/bash: -c: line 1: `do a thing'
There are other useful attributes attached to this object, which may assist in your workflows, or debugging:
[10]:
print('Shell process id is:', output.pid)
print('Working dir of the call is:', output.pwd)
print('The command that was sent is:', output.sent)
print('User that executed the cmd:', output.whoami)
print('You also have access to the returned code:', output.returncode)
Shell process id is: 42294
Working dir of the call is: /home/ljbeal/Work/Devel/remotemanager/docs/source/tutorials
The command that was sent is: do a thing
User that executed the cmd: ljbeal
You also have access to the returned code: 2
You can also use the output.kill()
method, which will attempt to terminate the process.
CMD History¶
Added in version 0.6.1.
URL
captures your most recent cmd
calls within a cmd_history
property.
This has a fixed length set by url.cmd_history_depth
(defaults to 10).
This is useful for debugging unexpected results from a call which was not captured within a variable.
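A fixed-length history of this kind behaves like a bounded queue: once it is full, the oldest entry is dropped to make room for the newest. The sketch below shows the idea with collections.deque. It assumes that the oldest entries are the ones discarded, and it is not taken from the remotemanager source.

```python
from collections import deque

history_depth = 10  # mirrors the documented default of cmd_history_depth
history = deque(maxlen=history_depth)

# Record 15 "calls"; only the 10 most recent survive.
for i in range(15):
    history.append(f"command {i}")

print(len(history))  # 10
print(history[0])    # command 5
print(history[-1])   # command 14
```

This is why important calls are better captured in a variable: anything older than the history depth is gone.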
[11]:
print(connection.cmd('pwd'))
print(connection.cmd_history[-1])
/home/ljbeal/Work/Devel/remotemanager/docs/source/tutorials
/home/ljbeal/Work/Devel/remotemanager/docs/source/tutorials
[12]:
print(type(connection.cmd_history[-1]))
<class 'remotemanager.connection.cmd.CMD'>
Async calls¶
Up until now we have been calling commands sequentially and waiting for the result. However, it’s possible to launch a command and proceed without waiting.
Below we have 2 structures that issue a command that waits for 3s, then returns the string “finished!”.
We will time how long the execution takes, and how long it takes to get back the result:
[14]:
import time
t0 = time.time()
output1 = connection.cmd('sleep 3 && echo "finished!"')
t1 = time.time()
dt = int(round(t1 - t0))
print(f'call took ~{dt}s')
print(output1)
t2 = time.time()
dt = int(round(t2 - t1))
print(f'collecting the results took ~{dt}s')
call took ~3s
finished!
collecting the results took ~0s
[15]:
t0 = time.time()
output2 = connection.cmd('sleep 3 && echo "finished!"', asynchronous=True)
t1 = time.time()
dt = int(round(t1 - t0))
print(f'call took ~{dt}s')
print(output2)
t2 = time.time()
dt = int(round(t2 - t1))
print(f'collecting the results took ~{dt}s')
call took ~0s
finished!
collecting the results took ~3s
As we can see, the first call waits for completion before returning the result. The second call, however, skips this waiting phase: we don’t have to wait for the command to finish until we request the result.
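The same launch-now, collect-later pattern can be reproduced with the standard library, which may help when reasoning about where the waiting actually happens. This is a sketch of the general technique using subprocess.Popen, not a description of how remotemanager implements asynchronous=True. The child process is a stand-in for the sleep-then-echo command, shortened to 1s.

```python
import subprocess
import sys
import time

# Stand-in for 'sleep 1 && echo "finished!"' that does not depend on a shell.
args = [sys.executable, "-c", "import time; time.sleep(1); print('finished!')"]

t0 = time.time()
process = subprocess.Popen(args, stdout=subprocess.PIPE, text=True)  # returns immediately
launch_time = time.time() - t0

t1 = time.time()
stdout, _ = process.communicate()  # blocks here until the process exits
collect_time = time.time() - t1

print(f"launch took ~{launch_time:.1f}s")
print(stdout.strip())
print(f"collecting the results took ~{collect_time:.1f}s")
```

Any work placed between the launch and the call to communicate() runs while the command is still executing.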
Fine Tuning a cmd
Call¶
URL has some further options available which may enhance your workflows in a remote setting. We have already seen the asynchronous
argument; let’s look at a few more.
To show these more in depth systems, we shall create a “dummy” connection:
[16]:
dummy = URL(user='username', host='remote.connection.address')
Dry Run¶
If you’re about to issue a command which could be potentially destructive (or time intensive), it is wise to check that it actually looks sensible.
dry_run
does just this. Instead of executing the command on the remote, it will simply return what it would execute as a string.
[17]:
dummy.cmd('echo "this call will just be returned as a string"', dry_run=True)
[17]:
ssh -p 22 -q username@remote.connection.address 'echo "this call will just be returned as a string"'
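The returned string follows a simple pattern: the ssh prefix, the user and host, then the quoted command. The sketch below rebuilds that pattern by hand to make the pieces explicit. build_ssh_command is a hypothetical helper for illustration only, and its naive single-quote wrapping would not cope with commands that themselves contain single quotes.

```python
def build_ssh_command(cmd, host, user=None, port=22):
    """Hypothetical helper: assemble an ssh invocation in the format shown above."""
    userhost = f"{user}@{host}" if user else host
    # Naive quoting: assumes cmd contains no single quotes.
    return f"ssh -p {port} -q {userhost} '{cmd}'"


print(
    build_ssh_command(
        'echo "this call will just be returned as a string"',
        host="remote.connection.address",
        user="username",
    )
)
```

Comparing a hand-built string like this against the dry_run output is a quick way to check that the user, host and port are what you expect.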
Local¶
A useful flag that you may need is local.
This allows you to run commands on your local machine, even using a URL
that is pointed at a remote. See the change in command here:
[18]:
dummy.cmd('echo "this call will just be returned as a string"', local=True, dry_run=True)
[18]:
echo "this call will just be returned as a string"
As this command skips over the remote portion, we don’t actually need the dry_run
here.
[19]:
dummy.cmd('echo "this call will just be returned as a string"', local=True)
[19]:
this call will just be returned as a string
Forcing a file-type execution¶
The internal CMD
object has a special run-mode where it will first dump the cmd to a file, then execute that file with bash
.
Normally this is used as a backup for a situation where the cmd can fail to execute. However you can force this behaviour by passing force_file=True
.
Once this cmd is communicated with, the file will be cleared from the system, so we need to use asynchronous=True
to prevent this.
[20]:
file_cmd = dummy.cmd('echo "this call will just be returned as a string"', local=True, force_file=True, asynchronous=True)
The filename is stored temporarily in the redirect
attribute of the resulting CMD
object.
[21]:
tempfile = file_cmd.redirect["execfile"]
print(tempfile)
221971b2.sh
[22]:
with open(tempfile) as o:
    print(o.read())
echo "this call will just be returned as a string"
Now if we access the result of the call, it will attempt to communicate with the process, removing the file as it does so.
[23]:
print(file_cmd.stdout)
this call will just be returned as a string
Global CMD Parameters¶
The remaining options can be set at the URL
level (not just on the cmd()
call), so we’ll demonstrate them there.
Options set this way then apply to all cmd calls issued by that URL
. (Though any args passed to the cmd()
method will override them)
Timeout Parameters¶
Each call to the remote will attempt to gracefully handle a timeout. In the case of a slow connection, a timeout will occur after timeout
seconds. The operation of this is as follows:
If a connection takes longer than timeout seconds to respond, it will issue an internal timeout error.
CMD will then wait for timeout seconds, then retry.
If the attempt fails again, CMD will wait for n * timeout and repeat, where n is the number of current failures + 1.
This continues until max_timeouts is reached, when a RuntimeError will be raised instead.
timeout defaults to 5s and max_timeouts defaults to 3 attempts.
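The retry steps above amount to a linearly growing back-off. The sketch below shows one reading of them, under two assumptions that are not confirmed by the text: the wait after the k-th failure is k * timeout, and max_timeouts counts failed attempts. call_with_retries is a hypothetical function, and the sleep is injected so the schedule can be inspected without actually waiting.

```python
import time


def call_with_retries(attempt, timeout=5, max_timeouts=3, sleep=time.sleep):
    """Hypothetical sketch: retry a callable with a linearly growing wait."""
    failures = 0
    while True:
        try:
            return attempt()
        except TimeoutError:
            failures += 1
            if failures >= max_timeouts:
                raise RuntimeError(f"gave up after {failures} timeouts")
            # Assumed schedule: 1 * timeout, then 2 * timeout, and so on.
            sleep(failures * timeout)


def always_times_out():
    raise TimeoutError


waits = []  # record the requested waits instead of sleeping
try:
    call_with_retries(always_times_out, timeout=5, max_timeouts=3, sleep=waits.append)
except RuntimeError as err:
    print(err)

print(waits)  # [5, 10]
```

Under these assumptions the defaults would spend 15s waiting between attempts before giving up, which is worth keeping in mind when choosing values for a slow connection.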
Note
This occurs on the communicate
side of a CMD
exec, so an asynchronous
call will not see this until you try to access the output (or trigger communicate
in another way).
[24]:
dummy = URL(user='username', host='remote.connection.address', timeout=10, max_timeouts=5)
[25]:
print(dummy.timeout)
10
[26]:
print(dummy.max_timeouts)
5
Added in version 0.13.4.
Note
You can now disable the timeout function by setting timeout
to 0
, a negative number or False
.
Landing Directory¶
Added in version 0.9.19.
By default, a URL.cmd
will “land” in the default directory that a standard ssh
would. This can be configured via the landing_dir
argument.
To demonstrate this, we will have to hop back over to a functional URL
.
(It will also be helpful to create a directory to land in using our main URL, connection.)
[27]:
connection.cmd("mkdir -p inner_directory")
print("initial landing dir:", connection.cmd('pwd'))
print("updated landing dir:", connection.cmd('pwd', landing_dir="inner_directory"))
initial landing dir: /home/ljbeal/Work/Devel/remotemanager/docs/source/tutorials
updated landing dir: /home/ljbeal/Work/Devel/remotemanager/docs/source/tutorials/inner_directory
Now let’s do this at the URL level.
[28]:
url = URL(landing_dir="inner_directory")
url.cmd("pwd")
[28]:
/home/ljbeal/Work/Devel/remotemanager/docs/source/tutorials/inner_directory
Editing the ssh string¶
URL has an ssh
property which will return the string that allows interfacing with the remote. If this needs updating for whatever reason it can be overridden by simply setting the attribute. The example below is used to remove a locale error that can occur on some systems.
Added in version 0.6.0: This specific update is no longer needed, as the locale errors are ignored by default. However the functionality of modifying your ssh
remains.
[29]:
print('initial ssh string:', dummy.ssh)
dummy.ssh = 'LANG=C ' + dummy.ssh
print('updated ssh string:', dummy.ssh)
initial ssh string: ssh -p 22 -q username@remote.connection.address
updated ssh string: LANG=C ssh -p 22 -q username@remote.connection.address
To undo this change, set ssh to None, or call url.clear_ssh_override().
[30]:
dummy.ssh = None
dummy.clear_ssh_override()
print('the reverted ssh string is', dummy.ssh)
the reverted ssh string is ssh -p 22 -q username@remote.connection.address
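An overridable property of this kind is usually built from a generated default plus an optional stored override. The sketch below shows that pattern in isolation. The Connection class is a hypothetical stand-in written for this example and does not reproduce the internals of URL.

```python
class Connection:
    """Hypothetical stand-in showing a property with a resettable override."""

    def __init__(self, host, user=None, port=22):
        self.host = host
        self.user = user
        self.port = port
        self._ssh_override = None

    @property
    def ssh(self):
        # An explicit override wins; otherwise the string is generated.
        if self._ssh_override is not None:
            return self._ssh_override
        userhost = f"{self.user}@{self.host}" if self.user else self.host
        return f"ssh -p {self.port} -q {userhost}"

    @ssh.setter
    def ssh(self, value):
        # Setting None clears the override and restores the generated string.
        self._ssh_override = value


conn = Connection(host="remote.connection.address", user="username")
default = conn.ssh

conn.ssh = "LANG=C " + conn.ssh
print(conn.ssh)

conn.ssh = None
print(conn.ssh == default)  # True
```

Because the default is regenerated on each access, clearing the override picks up any later change to the host, user or port.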
URL.utils¶
URL also provides a utils module, which offers both commonly used functions and more complex ones. The first of these are the mkdir and touch methods, which create a directory and a file at the given path, respectively.
[31]:
test_mtime = int(time.time())
connection.utils.mkdir('temp_utils_test')
connection.utils.touch('temp_utils_test/create_me')
connection.utils.touch('temp_utils_test/create_me_also')
[31]:
There is also utils.ls, which returns the files as a list by default.
[32]:
connection.utils.ls('temp_utils_test')
[32]:
['create_me', 'create_me_also']
The more powerful functions granted by utils are the search_folder, file_presence and file_mtime methods.
These methods allow searching for a list of files, condensing the query down to a single call in each case. This is useful for high-latency remote systems, where an ls search for 100+ files could take a long time.
search_folder
takes a list of files and a folder, returning a {file: bool} “truth-dict” of whether those files are present
[33]:
connection.utils.search_folder(['create_me', 'not_present'], 'temp_utils_test')
[33]:
{'create_me': True, 'not_present': False}
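The “truth-dict” idea is easy to reproduce locally: list the folder once, then answer every query from that single listing. The sketch below does this with os.listdir in a temporary directory. search_folder_local is a hypothetical local analogue for illustration, not the remotemanager implementation.

```python
import os
import tempfile


def search_folder_local(files, folder):
    """Hypothetical local analogue: one directory listing answers every query."""
    present = set(os.listdir(folder))  # a single listing, however many files
    return {name: name in present for name in files}


with tempfile.TemporaryDirectory() as folder:
    # Create one of the two files we will search for.
    open(os.path.join(folder, "create_me"), "w").close()
    result = search_folder_local(["create_me", "not_present"], folder)

print(result)  # {'create_me': True, 'not_present': False}
```

On a remote with noticeable latency the saving comes from the same principle: the cost is one round trip rather than one per file.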
Similarly, a more general form exists in file_presence
, which will take a list of files and return a similar truth-dict of their ls
presence
[34]:
connection.utils.file_presence(['temp_utils_test/create_me',
'missing_folder/file',
'temp_utils_test/not_present'])
[34]:
{'temp_utils_test/create_me': True,
'missing_folder/file': False,
'temp_utils_test/not_present': False}
If the file modification time is what you want, then file_mtime can be used in a similar way. This is the method called internally by file_presence, so it incurs no extra runtime.
[35]:
times = connection.utils.file_mtime(['temp_utils_test/create_me',
'temp_utils_test/create_me_also',
'temp_utils_test/not_present'])
print(times)
{'temp_utils_test/create_me': 1738078661, 'temp_utils_test/create_me_also': 1738078661, 'temp_utils_test/not_present': None}
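The returned values are Unix timestamps in seconds, with None for files that were not found. A short sketch of turning such a dictionary into readable UTC times, using only the standard library and the values printed above:

```python
from datetime import datetime, timezone

# Values copied from the output above; None marks a missing file.
times = {
    "temp_utils_test/create_me": 1738078661,
    "temp_utils_test/create_me_also": 1738078661,
    "temp_utils_test/not_present": None,
}

readable = {
    path: datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat() if mtime is not None else None
    for path, mtime in times.items()
}

for path, stamp in readable.items():
    print(path, "->", stamp)
```

Checking for None before converting matters here, since datetime.fromtimestamp raises a TypeError when given None.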
Tunnels¶
ssh tunnels allow a persistent connection to a machine. You could create a tunnel to a machine hosting a jupyter instance to access it locally, for example.
Let’s demonstrate what that looks like.
[37]:
remote_url = URL("remote.host")
remote_url.cmd("jupyter lab --ip=0.0.0.0", dry_run=True)
[37]:
ssh -p 22 -q remote.host 'jupyter lab --ip=0.0.0.0'
Note
The --ip=0.0.0.0
modifier is required to allow external connections.
This would start a jupyter lab session with the base python. Note that this is a simplified example; in your case you will most likely need to follow a different procedure to start jupyter.
In any case, let’s assume that the server is running on port 8888 (the default) on remote.host.
Now we can create a tunnel to it. If we have (or want) any locally run servers, we cannot reuse port 8888 locally, so let’s redirect to 9999.
[38]:
tunnel = remote_url.tunnel(local_port=9999, remote_port=8888, background=True, dry_run=True)
print(tunnel.cmd)
ssh -p 22 -q remote.host -N -L :9999:remote.host:8888 remote.host
This command creates a tunnel between your machine and the remote. The jupyter server will now be available at 127.0.0.1:9999
The local ip address can be changed by setting local_address
.
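Before choosing a value for local_port, it can be useful to check that nothing on your machine is already using it. The sketch below does this by attempting to bind the port with a plain socket. port_is_free is a hypothetical helper for this example and is not part of remotemanager.

```python
import socket


def port_is_free(port, address="127.0.0.1"):
    """Hypothetical helper: True if we can bind the given local port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((address, port))
        except OSError:
            return False
    return True


# Demonstrate with a port that we deliberately occupy ourselves.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as occupied:
    occupied.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    busy_port = occupied.getsockname()[1]
    print(port_is_free(busy_port))  # False while 'occupied' holds the port
```

The check is only a snapshot: another process could claim the port between the check and the tunnel being opened, so treat it as a convenience rather than a guarantee.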
Important
remotemanager
will attempt to avoid leaving “dangling” tunnels open. To help with this, the PID of a non dry_run
tunnel will be reported. However assigning the tunnel to a variable is advised, allowing you to call the kill()
method.
Note
The with
context (below) is the preferred method of handling tunnels, if it is possible for your use case. This will handle the safe closure of the tunnel on your behalf, even in the case of an exception.
The with
context¶
It is possible to execute commands using the python with
context. This ensures that the process is properly killed if an exception occurs.
Note
This can be used to ensure that your tunnels are closed if you’re using them within a script.
We can demonstrate this by generating a long running async command and then causing a failure within the context.
[39]:
t0 = time.time()
with url.cmd("sleep 300 && echo 'foo'", asynchronous=True) as c:
    print(f"my PID is {c.pid}")
    raise RuntimeError("Raise an exception here to force the with(...) to exit")
my PID is 42323
Exiting context, killing pid 42323
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[39], line 5
3 with url.cmd("sleep 300 && echo 'foo'", asynchronous=True) as c:
4 print(f"my PID is {c.pid}")
----> 5 raise RuntimeError("Raise an exception here to force the with(...) to exit")
RuntimeError: Raise an exception here to force the with(...) to exit
[40]:
print(f"dt: {time.time() - t0:.2f}s")
dt: 0.02s
Note that the time to execute is significantly shorter than the sleep. We can check that the process no longer exists by querying the pid:
[41]:
import psutil
psutil.pid_exists(c.pid)
[41]:
False
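The guarantee that a process is killed when the block exits, even on an exception, is the standard context-manager pattern. The sketch below reproduces it for a local subprocess using contextlib. managed_process is a hypothetical helper that illustrates the technique and is not the remotemanager implementation.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def managed_process(args):
    """Hypothetical helper: start a process and make sure it is dead on exit."""
    process = subprocess.Popen(args)
    try:
        yield process
    finally:
        # Runs on normal exit and on exceptions alike.
        if process.poll() is None:
            process.kill()
        process.wait()


# Stand-in for a long-running command.
sleeper = [sys.executable, "-c", "import time; time.sleep(300)"]

try:
    with managed_process(sleeper) as proc:
        raise RuntimeError("force the context to exit early")
except RuntimeError:
    pass

print(proc.poll() is not None)  # True: the process has been terminated
```

The cleanup lives in the finally clause, so it does not matter whether the body finishes normally or raises.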
ProxyJump¶
If you can connect with an ssh ...
command, you can do it using URL. An extension to this is ProxyJump
, which allows you to “hop” between hosts to get to your destination. If you have this set up, you will have an ssh config file that looks somewhat like this:
Host remote-endpoint
User username
Hostname remote.endpoint.address
ProxyJump remote-middleman
Host remote-middleman
User username
Hostname remote.middleman.address
The following URL is an example that would connect using these parameters:
[42]:
proxyurl = URL(host='remote-endpoint')
print(proxyurl.userhost)
remote-endpoint
[43]:
print(proxyurl.cmd('echo "test"', dry_run=True))
ssh -p 22 -q remote-endpoint 'echo "test"'
Setting a Default URL¶
The standard default URL is one pointed at localhost. This is in reality a safety measure to ensure that the Dataset at least has a URL. However, this is not ideal for a remote workflow.
The default_url
is a property of Dataset
, and can be set at the object level after importing.
[44]:
from remotemanager import Dataset
def func(inp):
    return inp
ds = Dataset(func, skip=False)
print(ds.url.userhost)
localhost
[45]:
url = URL("user@host")
Dataset.default_url = url
[46]:
ds = Dataset(func, skip=False)
print(ds.url.userhost)
user@host
I’m seeing errors that look like system messages¶
Added in version 0.10.15.
It is normal for machines to output information on connection: usage, documentation, disk quotas, etc. Sometimes, however, these messages are emitted on stderr rather than stdout. remotemanager will see this and assume that something has gone wrong. To prevent this, by default all ssh calls use the -q flag.
ssh -q
flag¶
This flag suppresses most errors and warnings, and should allow for smoother control of your machine. If you’re seeing strange behaviour with errors not being properly collected, you can either set url.quiet_ssh = False, or initialise URL with URL(..., quiet_ssh=False).